LISP1.OLD[LSP,JRA]12 - www.SailDart.org

perm filename LISP1.OLD[LSP,JRA]12 blob sn#169897 filedate 1975-07-25 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00018 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00003 00002	.SEC(LISP - a high-level mathematical language for data structures,Introduction)
C00010 00003	.REQUIRE "NOTES.INT" SOURCE_FILE
C00012 00004	.Sec(Symbolic expressions,Symbolic expressions)
C00024 00005	.<<Examples: s-exprs not sexprs>>
C00027 00006	.SS(Binary trees)
C00030 00007	.GROUP
C00031 00008	.REQUIRE "PROB1.PUB" SOURCE_FILE
C00032 00009	.SS(Primitive Functions)
C00035 00010	We have two unary selector functions, %3car%* and %3cdr%*, for traversing binary trees.
C00046 00011	.SS(Conditional Expressions,conditional expression)
C00050 00012	%3
C00056 00013	Here's an intuitive description of such a function (predicate) named %3equal%*.
C00062 00014	.REQUIRE "PROB2.PUB" SOURCE_FILE
C00063 00015	.SS(List Notation,lists,P100:)
C00074 00016	.REQUIRE "PROB4.PUB" SOURCE_FILE
C00075 00017	.SS(%3list%* and %3null%*)
C00080 00018	.SS(A respite)
C00086 ENDMK
C⊗;
.SEC(LISP - a high-level mathematical language for data structures,Introduction)

.BEGIN INDENT 20,20;
"... it is important not to lose sight of the fact that there is a difference
between training and education.  If computer science is a fundamental discipline,
then university education in this field should emphasize enduring fundamental
principles rather than transient current technology."
.END
.BEGIN TURN ON"→";
→Peter Wegner, %3Three Computer Cultures%*
.END

This text is nominally about LISP and data structures, however it in fact
covers much broader areas of computer science.
There are two reasons for this more ambitious venture. First the author has long 
felt that the  beginning student of computer science has been getting a distorted 
and disjointed picture of the field. In some ways this confusion is natural; the
field has been growing at such an enourmous rate that few of us are prepared
to be judged experts in all areas of the discipline.  The current alternative
seems to be to give a few introductory courses in programming and machine organization
followed by relatively specialized courses in more technical  areas. 
The difficult with this approach is that much of the technical material
never gets related. The student's perspective and motivation suffers in the
process. This book uses LISP as a means for relating topics which normally
get treated in several  separate  courses. The point is not that we %2can%*
do this in LISP, but rather that it is %2natural%* to do it in LISP.
The high-level notation for algorithms is beneficial in explaining and understanding
complex algorithms.
The use of abstract data structures and abstract LISP programs shows the
true intent of structured programming and step-wise refinement. Much
of the current work in mathematical theories of computation is based on LISP-like
languages. Thus LISP is a formalism for describing algorithms, for writing programs, and
for proving properties of algorithms.
We use data structures as the main thread in our discussions because 
a proper appreciation of data structures as abstract objects is a necessary
prerequisite to an understanding of modern computer science.

Just as it is unnecessary to learn machine language to study numerical
algorithms, it is also unnecessary to learn machine language to understand
non-numerical processes.  The distinction we are making is between data
structures and ⊗→storage structures↔←.  That is, ⊗→data structures↔← are independent
of %6how%* they are implemented on a machine.  Data structures are representations
of information chosen to exhibit certain ordering and accessibility relationships
between data items.  Certainly in the real world you cannot ignore
storage structures when you are deciding upon the data structures which
will encode your problem, but the interesting aspects of representation of
information can be discussed at the level of data structures with no loss
of generality.  The mapping of data structures to storage structures is
machine dependent anyway, and consists of bit-pushing, trickery and black
magic.  We are more interested in ideas than coding tricks.

A very crucial problem in data structures and algorithms is the
interplay between the representation and the algorithm:  if you pick a
frugal representation perhaps your accessing algorithms become more time
consuming or the algorithms become less general.  We will study this
interrelation carefully in this text. We will also see that it is possible,
and most beneficial, to structure our programs such that there is a very
clean interface between the abstract algorithm and the chosen representation.
That is, there will be a set of representation-manipulating programs to test,
select or construct elements of the domain; and there will be a program
encoding the algorithm. To change representations only requires changes
to constructors, selectors and predicates, not to the basic program.

.REQUIRE "NOTES.INT" SOURCE_FILE;

Finally a note on the structure of the text. 
The emphasis  flows from the abstract to the specific, beginning
with a precise description of the domain of LISP functions
and the operations defined over that domain, and  moves to a discussion
of the details of efficient implementation of LISP-like languages.
The practical-minded programmer might be put-off but the "irrelevant"
theory and the theoretical-minded mathematician might be put-off
by the "irrelevant" details of implementation. If you lie somewhere in between
these two extremes, then welcome.

.Sec(Symbolic expressions,Symbolic expressions)

An investigation of any discipline will profit from precision.
A precise description of what is being studied, and what methods
are to be applied is crucial in 
any study, whose final arbiter 
is a machine.
This book is a study of data structures and programming languages.
How are we to proceded? How do we introduce rigor into a field
which is as %3ad hoc%* and diverse as programming.
For guidance let's look at mathematics. 

One of the more fertile, yet easily introduced areas of mathematics, is that of
elementary number theory. It is easy to introduce because
everyone "knows" the integers.
Basically this field studies properties of a certain class of operations definable
over the set of non-negative integers called %dN%*.
A very formal presentation might begin with a construction of %dN%*
from more primitive notions, but it is usually assumed that the reader
is familiar with the basic properties of %dN%*. 
In either case the next step would be to delimit the
class of operations which we would allow ⊗↓For example, certain
methods of analysis are not allowable in elementary number theory.←;
finally we would begin the study %3per se%*.

We shall begin our study of LISP in a similar manner, 
as an investigation of a certain
class of operations definable over  a domain of objects.
Our objects are called  %2Symbolic Expressions%*.
Our domain of  Symbolic Expressions is named %d<sexpr>%*.
⊗→Symbolic expressions↔← are also known as  %6⊗→S-expressions↔←%*
or %6⊗→S-exprs↔←%*.  
After we present a description of %d<sexpr>%*, we introduce  operations
on %d<sexpr>%*.


Though everyone "knows" the integers, we are perhaps not so fortunate when it
comes to Symbolic expressions. We must define them. Let's look to mathematics again
for help. If we were forced to define the integers, then our definition
would depend on how well we "know" them.

For a layman, a characterization as a sequence of decimal digits might
be satisfactory.

If pressed for details we might attempt a characterization like the following:

.BEGIN GROUP;TABIT1(15);
\%21.%* %30%* is a number

\%22.%* If %3n%* is a number then %2S%3(n)%1, the "successor" of %3n%*, is a number.

\%23.%* The only numbers are those created by %21%* and %22%*.
.END

If we were inclined toward set theory, we might attempt to develop
the non-negative integers from more primitive set-constructions.


Two points should be apparent here. First we must stop %2somewhere%* in our
explication. We must assume that the questioner knows %2something%*:
"digit", %30%*, or "set".
Secondly, and by far more important, there is a different level of characterization
between the "layman" and the "successor" above.

The "layman" characterization is essentially syntactic.
The explication tells us nothing about the interrelationships between the
numbers. You might argue that the "successor" notation is also 
syntactic, only uglier. %2S%3(%2S%3(0))%1 is just a terrible way to write %32%*.
One benefit of the "successor" notation is that it explicitly shows the
means of construction. That is, it shows more of the properties of the
numbers than just distinguishability.
It would be better to think of the positional notation of the "layman"
as simply a notational convenience for the "successor" description.

But notation and syntax are necessary and we must be able to give precise
descriptions of syntactic notions. First, let's recast the layman's 
description in a style similar to that of "successor".

.BEGIN GROUP;TABIT1(15);

\%21.%* A number is a digit, or

\%22.%* if %3n%* is a number then %3n%* followed by a digit is a number.

\%23.%* The only numbers are those created by %21%* and %22%*.
.END

Notice we assume that "digit" is understood, and that "followed by" is
juxtaposition.
In words, "a number is a digit or a number followed by a digit".
Using a style of syntax specification
called BNF (Backus-Naur Form) equations we can give a more concise
description that this.

.BEGIN TABIT1(15);

<number>\::= <digit>
<number>\::= <number><digit>

.END
Here, the symbol "::=" is to be read "is";
the strings beginning with "<" and ending with ">" derive from the
references to "number" and "digit" in %21%1 and %22%*.
and the sequence "><" is to be read "followed by".
As an abbreviation, the  two BNF equations may also be written:
.BEGIN TABIT1(15);

<number>\:: = <digit>|<number><digit>,

where the symbol "|" is to be read "or";
.END


To begin the study of LISP we should therefore characterize the domain
of Symbolic Expressions at a similar degree of precision.
We will do so  in two steps. First
a primitive domain is described, then we show how to augment that domain to 
arrive at the full  set of Symbolic Expressions.


Our primitive domain, %d<atom>%*,
consists of the  %2⊗→literal atoms↔←%* and the class of signed 
integers. The following BNF equations describe the syntax we will use
for elements of %d<atom>%*.

.GROUP SKIP 2;
.BEGIN TABIT1(15);
.GROUP;
<atom>\:: = <literal atom>|<number>| -<number>
<literal atom>\:: = <atom letter>|<literal atom><atom letter>|<literal atom><digit>
<number>\:: = <digit>|<number><digit>
<atom letter>\:: =%3 A |B |C ...| Z%*
<digit>\:: = %30 |1 |2 ... |9%*
.APART;
.END;
Thus a literal atom is a string of uppercase letters and digits, subject
to the provision that the %2first%* character in the atom be a letter.
Literal atoms are also known as %2identifiers%*.


.GROUP SKIP 2;
.BEGIN TABIT2(20,35);GROUP;
For example:\atoms\not atoms
%3\ABC123\2a
\12\a
\A4D6\$$g
\NIL\ABD.
\T\(A . B)
%*
.group skip 1;
.END;

Most of our discussions will deal with non-numeric atoms. When viewed as
data structures, ⊗→numbers↔← have no properties not possessed by literal atoms.
Most implementations
of LISP do however contain a large arithmetic entourage, including 
floating point numbers and operations. Many implementations also
give a wider class of literal atoms, allowing some special characters to appear;
for most of our discussion the above class is quite sufficient.

Using a LISP operator to be described later, we are able to augment
the domain, %d<atom>%*, obtaining the domain %d<sexpr>%*.
Thus %d<sexpr>%* includes %d<atom>%* as a proper subset. 
These non-atomic S-expressions are  called %2⊗→dotted-pairs↔←%* and
can be written as follows: 
a left-parenthesis followed by an
S-expr, followed by a period, followed by an S-expr,
followed by a right-parenthesis is also an S-expr.
Here's a BNF description of the full set of S-expressions.
.BEGIN TURN ON "←";
.P47:

←<sexpr> :: = <atom>|%3(%*<sexpr> . <sexpr>%3)

.END;

Notice that if we allow floating point numbers  as atoms some care needs to be
exercised when writing S-expressions. How should %3(A.1.2)%* be interpreted.
Is it the dotted pair %3(A . 1.2)%*, or is it just an ill-formed expression?
Evaluation of such ambiguous constructs will depend on the implementation; such
details do not interest us yet.

.<<Examples: s-exprs not sexprs>>
.BEGIN TABIT2(20,40);
.GROUP;

Examples:\S-exprs\not S-exprs
%3
\A\A . B
\(A . B)\(A . B . C)
\(((A.B) .C) . (A.B))\((A . B)))
%*
.APART;
.END;
Recall  our caveat on syntax and numbers. It also applies here. When we
described  the domain, %d<sexpr>%*, 
 we picked a %2specific%* syntactic representation for its elements.
It will be a
convenient notation since it makes explicit  the construction of the composite
<sexpr> from
its components, ⊗↓Just as the "successor" notation shows  the  construction
of  the numbers from %30%*. This kind of  notation will  be much more useful
in LISP, since our interest in data structures will  focus on the 
construction process and the interrelationships between components of a
<sexpr>.←
and the  notation is also consistent with LISP history.

However there is more to the domain, %d<sexpr>%*, than  syntax;
just as there is more to %dN%* than positional notation
⊗↓%32%*, II in Roman numerals, 10 in binary, ... are all 
representations of the number two.←.
What %2are%* the essential features of s-expressions? 
Symbolic expressions are either
atomic or they have two components. 
If we  are confronted with a non-atomic s-expression
then we want a means of distinguishing between the
"first" and the "second" component. 
The "dot notation" does this for us, but 
obviously "(", ")", and "."  of the dotted-pairs are simply
notation or syntax.  We could have just as well represented
the  dotted-pair of %3A%* and %3B%*
as the set-theoretic ordered pair, %3<A,B>%*, or
any other notation which preserves the essentials of the domain %d<sexpr>%*.
We will try to say more about the essence of "dotted-pairs" and
data structures, in general, as we proceed.
.SS(Binary trees)
Besides the more conventional typographical notations, s-expressions
also have interesting %2graphical%*  representations.
S-exprs have a natural
interpretation as non-empty %6⊗→binary trees↔←%*.  
A non-empty binary tree is a branching structure 
consisting of a single root node and perhaps a collection of terminal
and non-terminal nodes.  From non-terminal nodes we sprout exactly two
branches; no branches are grown from the terminal nodes.  And most
important:  there are no intersecting branches.  We will talk about more
general structures later (called list-structures or directed graphs).
In our discussions, an <atom> will be attached to each terminal node, and
non-terminal nodes will be unlabeled.
.GROUP SKIP 1;
.GROUP;
Here are some binary trees:

.BEGIN TABIT2(5,40);SELECT 3;

					.		   
.GROUP SKIP 8;
\1    2       A   D   E\A    B    NIL

.END


.APART;
%1
Perhaps you can see how to interpret S-exprs now.  The atoms are
interpreted as terminal nodes;  and since non-atomic S-exprs always
have two sub-expressions we can write the first
sub-expression as the left branch of a binary tree and the second sub-
expression as the right branch.  
.GROUP SKIP 2;
.GROUP;
For example:

.BEGIN TABIT2(10,40);SELECT 3;

\(A . B)\(A . (B . C))
.GROUP SKIP 5;
\A   B\A    B    C

.APART;


.GROUP;
%1or:%*

\((A . B) . C)\(A . (B . NIL))

.GROUP SKIP 8;
\A     B         C\ A     B     NIL
.APART;
.END
%1
.P102:
Other representations of binary trees which will be more suggestive when
we talk about implementation of LISP are:
%3
.GROUP
.BEGIN NOFILL;




(A .(B . C)) %1is:%*       A                                %1or%*    A            B         C



                              B    C



.END;
.APART
.SELECT 1;
.REQUIRE "PROB1.PUB" SOURCE_FILE;
.SS(Primitive Functions)
%1
So much for the domain of objects;  what we need now are some functions
or operators to perform operations on the domain.  
What is a function? The logician  Alonzo Church, says a function is simply a
mapping such that for any given argument in the domain of the function there
exists a corresponding value. No rule of computation need be given to help
locate values. However, as you might expect, rules of computation will be
an important part of our discussion and noting the differences between the
mathematical or logical concept of function and the related idea of 
procedure or computable function will be one of our interests.
For the moment it suffices to think of LISP functions in the usual
mathematical sense.

The first LISP function we consider is the
⊗→%3cons%*↔← (constructor) function which is used to generate s-exprs from less
complicated s-exprs.  %3cons%* is a ⊗→total function↔←, that is, it is defined for all
s-expr arguments. It is a function of two arguments and has as value a
new s-expr interpreted as a binary tree with left branch as the first argument
and with right branch, the second argument.  For example:
 
.BEGIN CENTERIT;
←%3cons[A; B] = (A . B)
←cons[(A . B); C] = ((A . B) .C)
.END;
%1

.GROUP SKIP 1;
We have two unary selector functions, %3car%* and %3cdr%*, for traversing binary trees.
They are both ⊗→partial functions↔←;
that is, they are functions which are %2⊗→undefined↔←%* for some arguments
⊗↓How "undefined" manifests itself on  a machine will depend on the implementation.
Sometimes error messages are given; sometimes it results in an excursion into
the subconscious. Shortly ({yonss(P85)}) we will show that a precise discussion
of partial functions and the "undefined" value can be given.←.
These LISP functions are defined only for non-atomic arguments.
⊗→%3car%*↔← returns as value the first subexpression of its (composite) argument; ⊗→%3cdr%*↔← 
(pronounced could-er) returns as value the second sub-expression of its
argument.  For example:
.GROUP SKIP 1;
.BEGIN CENTERIT;
.GROUP;
←%3car[(A . B)] = A        car[A] %*is undefined.
←%3cdr[(A . B)] = B       cdr[(A .(B . C))] = (B . C)
←car[((A . B) . C)] = (A . B)
%*
.APART;
.END;
.GROUP SKIP 1;
.GROUP;
As with most mathematical theories, we will allow ⊗→functional composition↔←.
The composition of two unary functions %3f%* and %3g%* is another
function, sometimes denoted by %3f⊗g%*. The value of an expression, %3f⊗g[x]%*,
is the value of %3f[g[x]]%*.
That is, the value of %3f⊗g[x]%* is a %3z%* 
such that %3y%* is the value of %3g[x]%*
and %3z%* is the value of %3f[y]%*. %3f⊗g%* may be undefined for many reasons:
for example,
%3g[x]%* may be undefined, or %3f[y]%* may be undefined.
For example:

.GROUP SKIP 1;
.BEGIN CENTERIT;
%3
←car⊗cdr[(A .(B . C))] = car[cdr[(A .(B . C))]] = car[(B . C)] =  B 
←cdr⊗cdr[(A .(C . B))] = cdr[cdr[(A .(C . B))]] = cdr[(C . B)] =  B
←cdr[cdr[A]] = %1is undefined%*
←car[cdr[(A . B)]] = %1is undefined%*
←car[cons[x;A]] = x     cdr[cons[Y;y]] = y .
%*
.END;
.APART

Notice that in the last two examples we have introduced variables (over
S-exprs).  In the sequel lower-case letters (or lower-case identifiers) will be
used freely as  variables.  So for example %3Y%* is an atom; %3y%* is a variable.

The composition of many %3car%* and %3cdr%* functions occurs so frequently that an
abbreviation has been developed.  
Given such a composition, we select  in left-to-right order, 
the relevant %3a%*s and %3d%*s in the %3car%*s and %3cdr%*s. We sandwich this
string of %3a%*s and %3d%*s between a left-hand %3c%* and a right-hand %3r%*
and give the composition this name.

.BEGIN CENTERIT;GROUP;
For example:
%3
←cadr[x] <= car[cdr[x]]
←caddr[x] <= car[cdr[cdr[x]]]
←cdar[x] <= cdr[car[x]]
%*
.END;
These compositions are also called ⊗→%3car-cdr-%2chains%1↔←, and are useful in
traversing binary trees. The "<="- notation is to be read "is defined to be the 
function ..."
This notation is only a temporary convenience and not part of LISP.
Very soon we will study what is involved in giving and using definitions in LISP
({yonss(P49)}).
For the moment intuition should suffice.

.BEGIN TURN ON "#";
Mathematically speaking, a composition of functions is simply another function
--#i.e.,#a mapping#-- and therefore nothing need be said about how to compute
composed functions. From a more pragmatic view, we want to  express evaluation
of expressions involving composed functions in terms of the evaluation of
subexpressions.
Perhaps the most natural way to evaluate expressions involving compositions is
to evaluate the inner-most expressions first, then work outwards. 
Assume arguments to multi-argument
functions may be evaluated in any order. Thus:
.END

.BEGIN SELECT 3;GROUP;CENTERIT;

←cons[car[(A . B)];cdr[(A . (1 . 2))]] = cons[A;cdr[(A . (1 . 2))]] = cons[A;(1 . 2)] = (A .(1 . 2))
←%1or:%*  = cons[car[(A . B)];(1 . 2)] = cons[A;(1 . 2)] = (A .(1 . 2))
.END

.P106:
Evaluation seems to be quite a simple operation but
looks can be deceiving; it can be a very complex process.
Since computations may not terminate, the value of an expression
may depend on the %2order%* in which we do things. For example
in the evaluation of %3foo[t%41%*;t%42%*]%1 where %3foo[x;y]#<=#y%1,
we may run into difficulties if we insist on evaluating %3t%41%1.
The computation of the value of %3t%41%1 may not terminate,
or it may involve an undefined expression, like taking %3car%* of
an atom for example.
In either case,
we lose even though it is reasonable to believe that 
the value of the expression should not depend on %3t%41%1.


Or consider the seemingly innocous example:

.BEGIN CENTERIT;SELECT 3;
←car[cons[x;y]] =x.
It is a simple variant to:
←car[cons[x;A]] =x
.END
The second equation holds for all values of %3x%*. 
It is an identity. Even if %3x%* is undefined,
it holds. Even if the computation of the value of %3x%* does not terminate,
the equation holds.
Neither side terminates.  We are not so fortunate in the first example. For
if the computation of the argument %3y%* runs into difficulties, the left-hand
computation will not terminate, while the right-hand side will be
defined if %3x%* is defined. We will meet more computational difficulties later.

Non-terminating computations bring up a further point. What should
be the "value" of a non-terminating computation? We will say that
forms whose evaluation involve such anti-social computations will
be undefined. Thus a computation may be "undefined" for two reasons:
it involves a non-terminating computation; it involves applying a 
partial function to a value not in its domain.
Suffice it to say that we must be careful.

The syntax of the LISP expressions (or forms) allowed so far is given below.

.BEGIN TABIT1(13);GROUP;
.P66:

<form>\::= <constant> | <function>[<arg>; ... <arg>] | <function>[] | <variable>

<constant>\::= <sexpr>               (see {yon(P47)}.)

<function>\::= <identifier>

<arg>\::= <form>

<variable>\::= <identifier>

<identifier>\::= <letter> | <identifier><letter> | <identifier><digit>

<letter>\::= %3a | b | ... | z  

.END
We will frequently violate these syntax equations, allowing function names
containing special characters, e.g. %3fact*%*, %3fib%41%1 or + ; or writing %3x +y%*
instead of %3+[x;y]%*.  No attempt will be made to characterize these violations;
occurrences of them should be clear from context.

Notice that the class <⊗→form↔←> is a collection of LISP expressions which can
be evaluated; either constants, function name followed by arguments, or a 
variable which should have a current binding or value. Again we will wait
to {yonss(P49)} for a precise description.

.P111:
An important restraint on LISP forms which is not covered by the syntax
equations is the requirement that functions are defined as being n-ary for
some %2fixed%* n. n-ary functions
must have %2exactly%* n arguments presented to them whenever they are applied.
.P95:
Thus %3cons[A], cons[A;B;C], %*and %3car[A;B]%* are all ill-formed expressions
and are therefore undefined.

.CENT(Problems)

I. Discuss %3cons[car[x];cdr[x]] = x%1.
.SS(Conditional Expressions,conditional expression)
We cannot generate a very exciting theory based simply on %3car, cdr,%* and %3cons
%*
with functional composition. Before we can write reasonably interesting 
algorithms we must have some way of performing conditional actions.
To do this we need ⊗→predicates↔←. A LISP predicate will simply be a function
returning a value representing truth or falsity.  Since our functions
return Sexprs as values we must represent truth and falsity as Sexprs.
We will take the atoms ⊗→%3T%*↔← and ⊗→%3NIL%*↔← to represent ⊗→true↔← and ⊗→false↔←, respectively.
Notice that since LISP predicates %2are%* functions, they may be freely 
composed with other functions.

LISP has two primitive predicates.
The first is a total predicate named ⊗→%3atom%*↔←.  It is a predicate
of one argument, returning %3T%* or %3NIL%* depending on whether or not the value of that
argument is atomic.
.GROUP SKIP 1;
.BEGIN CENTERIT;
.GROUP
%3
←atom[A] = atom[NIL] = T
←atom[atom[A]] = T = atom[atom[(A . B)]]
%*
.APART
.END

The second primitive predicate is named ⊗→%3eq%*↔←.  It is a binary predicate, defined
only if the values of its arguments are both atomic.  It returns %3T%* if the arguments evaluate to the
same atom; it returns %3NIL%* otherwise.
.BEGIN CENTERIT;
%3
.GROUP SKIP 1;
.GROUP
←eq [A;A] = T			eq [A;B] = NIL
←eq [(A . B); A] %* is undefined, as is    %3eq [(A . B);(A . B)]
←eq [eq [A;B];eq [C;D]] = T
%*
.APART;
.END

The IF-THEN-ELSE operation in LISP is called the conditional expression.  It
is written:
.BEGIN CENTERIT;
%3
←[p%41%* → e%41%*; p%42%* → e%42%*  ... ; p%4n%*  → e%4n%*]
%1
.END

.P40:
The meaning (or semantics) of conditionals is:  Each %3p%4i%1 is a predicate; 
each %3e%4i%1
is an expression.  We evaluate the %3p%4i%1 's from left to right, finding the %2first
%*
which returns value %3T%*.  When we find such a %3p%4i%1 , we evaluate the corresponding
%3e%4i%1.  The value of the conditional expression is the value computed by %3e%4i%1; if none
of the %3p%4i%1's are true then the conditional expression is undefined.  The
conditional expression is also undefined if we come across a %3p%4i%1
which is undefined.  
.BEGIN CENTERIT;
.GROUP;
Examples:

%3
←[atom [A] → B; eq [A;(A . B)] → C] = B
←[eq [A;(A . B)] → C; atom [A] → B] is undefined
←[atom [(A . B)] → B; eq [A ; B] → C; eq [car[(A . B)]; cdr[(B . A)]] → E] = E
%1
.APART
.END
Notice that the p%42%* expression of the first example is undefined, but
the conditional gives value %3B%* since p%41%* gives value %3T%*.


In applications of conditional expressions it is often convenient to be
assured that the conditional always is defined.  To make sure that at least one of
the %3p%4i%1's is true we can pick a predicate for %3p%4n%1, (the last predicate in the 
conditional) which is always true.  You can think of lots of predicates which
are always true ( for example, %3eq [1;1]%* ).  A natural predicate is the
constant predicate, %3T%*.  Thus:
.GROUP

.BEGIN CENTERIT;

%3
←[p%41%*  → e%41%*; T → e%42%*]
%1
.END
can be read:  If %3p%41%1  is true then  %3e%41%1  else %3e%42%1.  (or otherwise %3e%42%1)

.APART;
.BEGIN TURN ON "#";
.P17:
Some care has to be given to the defintion of the meaning of a conditional expression.
First the order of evaluation %2is%* important. 
Consider: %3f[x;y]#<=#[y#=0#→#0;#T#→#x/y]%*. If we insisted on evaluating all of the
constituents of the conditional first, then %3f[ ...;0]%* would be undefined, rather
than giving value %30%* as our current interpretation would.

There are other interpretations of equal merit. Consider the nonsense definition:
%3g[x;y]#<=#[lic[x]#→#1;T#→#1]%*. Clearly, assuming that %3lic%* is a total predicate, 
any value computed by %3g%* will be %31%*. But requiring  left-to-right evaluation
could spend a great deal of unnecessary computation if %3lic%* is a 
%2l%*ong %2i%*nvolved %2c%*alculation.

Questions of evaluation are non-trivial. You should at least be aware that some
decisions have been made and others were possible.
.P88:
With conditional expressions, we are seeing the first major step away from
traditional mathematical thought. We may not continue to think of conditional
expressions as a notational convenience for evaluation of some mathematical 
expression, say:
.BEGIN CENTERIT; SELECT 3;

←[p%41%* → e%41%*; p%42%* → e%42%*] %1is%3 cond[p%41%*; e%41%*; p%42%*; e%42%*]
.END
where %3cond%* is a function. There are several difficulties: 
certainly the order of evaluation cannot be adequately expressed but perhaps the
most serious involves the observation that arguments to %3cond%* would frequently
be undefined. Examine %3[eq[A;B]#→#C;#eq[car[A];B]#→#D]]%* and
%3cond[eq[A;B];C;eq[car[A];B];D]%*. However, in {yonss(P85)} we will show that
a reasonable mathematical interpretation can be given to LISP conditional 
expressions.
.END

A word to the wise.  It doesn't seem like you can do much damage with such
a limited collection of operations.  %6Do not be deceived!%*  In elementary number
theory all you have is zero and some simple functions, and elementary number theory
is far from elementary.
Manipulation of our primitives, with composition, and conditional expressions, 
coupled with techniques for definition can %2also%* become complicated.

For example: our predicate %3eq%* is defined only for atomic arguments.  We would like
to be able to test for equality of arbitrary sexprs.  
By equality we mean: as trees, the sexprs have the same branching structure; and
the corresponding terminal nodes are labeled by the same atoms.
Thus, we would
like to define a predicate, ⊗→%3equal%*↔←, such that:
.GROUP SKIP 1;
.GROUP
.BEGIN CENTERIT;
%3
←equal [(A . B);(A . B)] = T = equal [A;A]
←equal [(A . B);(B . A)] = NIL 
←equal [(A . (B . C));(A . (B . C))] = T

%1
.END
.APART

Here's an intuitive description of such a function (predicate) named %3equal%*.

.BEGIN INDENT 0,4;
1.  If both arguments are atomic then see what %3eq%* says about them
(are they "%3eq%*").  We can test if they are both atomic by using %3atom%*
and a conditional expression.
.END

2.  If one is atomic and the other is not they can't be equal s-exprs.

.BEGIN INDENT 0,5;
3.  Otherwise both are non-atomic sexprs.  Both have two sub-expressions.
Look at both first subexpressions.  If the first sub-expressions are not
equal then certainly the initial expressions cannot hope to be equal.  
If, however, the first subexpressions are equal then the question of
whether or not the initial expressions are equal depends on the
equality (or non-equality) of the second subexpressions.  Thus the
following definition:
.END
%3
.GROUP SKIP 1;
.BEGIN TABIT2(19,30);GROUP;
.P1:
     equal[x;y] <=\[atom[x] → [atom[y] → eq [x;y];
\\ T→ NIL];
\ atom[y] → NIL;
\ equal [car[x];car[y]] → equal[cdr[x];cdr[y]];
\ T → NIL]

.END
%1
Notice that we use nested conditional expressions in %3equal%*; e%41%*
is itself a conditional. Also we have used predicates in the e%4i%* positions
at e%43%* and e%411%*; this is perfectly reasonable since %3equal%* is a predicate
and thus returns either %3NIL%* or %3T%*. 

Let's show that %3equal%* does perform correctly for a specific example.  This will
also show a complicated evaluation of a conditional expression.

.BEGIN TABIT2(25,44);GROUP;SELECT 3;

equal[(A . B);(A . C)] =\[atom[(A . B)] → [atom[(A . C)] → eq [(A . B);(A . C)];
\\ T→ NIL];
\ atom[(A . C)] → NIL;
\ equal [car[(A . B)];car[(A . C)]] → equal[cdr[(A . B)];cdr[(A . C)]];
\ T → NIL]
.APART
.BEGIN FILL;SELECT 1;
Now using the meaning of conditionals ({yon(P40)}),
we find that p%41%* (i.e.,#%3atom[(A#.#B)]%*#)
and p%42%* (#%3atom[(A#.#C)]%*#) when evaluated (in order) give %3NIL%*.
We must now evaluate p%43%*: %3equal[car[(A#.#B)];car[(A#.#C)]]%*.
This reduces to %3equal[A;A]%*, and:
.END

.GROUP;
   equal[A;A] =\[atom[A] → [atom[A] → eq[A;A];
\\ T→ NIL];
\ atom[A] → NIL;
\ equal [car[A];car[A]] → equal[cdr[A];cdr[A]];
\ T → NIL]
.APART
.BEGIN FILL;SELECT 1;
This conditional expression will finally evaluate to %3T%*. So p%43%* in the
original call of %3equal[(A#.#B);(A#.#C)]%* is true, and we are faced with the evaluation of e%43%*
which is %3equal[cdr[(A#.#B);cdr(A#.#C)]]%*. After evaluation of the arguments and 
evaluation of the conditional expression defining %3equal%* we will finally
return value %3NIL%*. That is, %3(A . B)%* and %3(A . C)%* are %2not%* equal.
Clearly evaluation of LISP expressions in this great detail is not a process
which we wish to do very often. 

Notice that throughout this example expressions like %3atom[(A#.#B)]%*
or %3eq[(A#.#B);(A#.#C)]%* were appearing but were never evaluated because
of the way in which we defined evaluation of conditionals.
.END
.END
Finally, to include conditional expressions in our syntax of LISP
expressions, we should add:

.BEGIN TABIT1(11);

<form>\::= [<form> → <form>; ... <form> → <form>]         (see {yon(P66)})

.END

As usual these syntax equations fail to capture all of our intended
meaning. The <form>s appearing in the p%4i%*-position are to be forms
evaluating to either %3T%* or %3NIL%* or are to be undefined.

.REQUIRE "PROB2.PUB" SOURCE_FILE;
.REQUIRE "PROB3.PUB" SOURCE_FILE;

.SS(List Notation,lists,P100:)

.BEGIN TURN ON "#";
In many applications of LISP it is convenient to deal with a representation for
sequences.  A sequence is  an ordered set of elements.
For example, %3(x%41%*, x%42%*, x%43%*)%1, is standard notation for 
a sequence of  the three elements, x%41%*, x%42%*, and x%43%*.
We will use "%3()%*" as notation for the empty sequence.
We will allow sequences to have
sub-sequences to an arbitrary depth.
Thus %3(A,#(B,#C),#D,#(E,#B))%* is a sequence of length four, whose second
and fourth elements are also sequences.  
We will want to write LISP functions operating  over sequences, so we will
need to give constructors, selectors and predicates for sequences and
ultimately will need to give a representation of these sequence manipulating operations
as operations on Sexpressions.

What are the essential characteristics of a sequence? First, a sequence
is either empty or it has elements. Thus we will want a predicate to test for emptyness.
Next, if the sequence is non-empty, we should be able to select elements. Finally,
given some elements, we should be able to build a new sequence from them.
The LISP terminology for a sequence is a %2list%*. Thus our sequence-constructor
is called ⊗→%3list%*↔←. 

.BEGIN CENTERIT;SELECT 3;
←list[x%41%*; x%42%*; ...; x%4n%*] = %3(x%41%*, ..., x%4n%*)%1

The basic predicate, which tests for emptyness, is called ⊗→%3null%*↔←.

.BEGIN TABIT2(10,20);
\%3null[x]%1 is \%3T%* if %3x%* is the empty sequence, %3()%*.
\\%3NIL%* if %3x%* is a non-empty sequence.
.END

.BEGIN FILL
Thus %3null%* is a partial predicate, defined only for sequences.
We will name the predicate which recognizes lists, ⊗→%3islist%*↔←.
.END

←%3islist[(A, B, C] = T
←%3islist[A] = NIL%1
←%3islist[()] = T%1

.BEGIN FILL;
Thus %3islist%* is a total predicate.
The element selectors for a (non-empty) sequence are %3first%*, %3second%*, ... ,etc. .
.END

←%3first[(A, B, C)] = A
←%3second[(A, B, C)] = B
←%3third[(A, B)]%1 is undefined

.BEGIN FILL
It is also convenient to define an "all-but-first" selector, called ⊗→%3rest%*↔←.
.END

←%3rest[(A, B, C)] = (B, C)
←%3rest[(B, C)] = (C)
←%3rest[(C)] = ()
←%3rest[C] = %1is undefined

.BEGIN FILL
and, in conjunction with %3rest%*, we shall utilize a constructor, ⊗→%3concat%*↔←,
which is to  add a single element to the front of a list.
.END
←%3concat[A;(B,C)] = (A, B, C)
←%3concat[A;()] = (A)
←%3concat[(A);(B,C)] = ((A), B, C)
←%3concat[(B,C);A] = %1is undefined

.END

Notice that we have been describing %3list%*, %3null%*, and the selector
functions without regard to their underlying representation as functions
which manipulate Sexpressions.
We have said nothing about these list-operations except that they
construct, test, or select.
These operations %2will%* be represented as LISP functions, and thus
we must supply %2some%* representation for sequences.

.END
.GROUP
The choice we make is represented graphically as follows:
%3
.BEGIN VERBATIM


(X, Y, Z)  ≡                                                      ≡ (X  . (Y  . (Z  . NIL)))  

                  X               Y               Z      NIL

.END
.APART
%1
Or:
.GROUP SKIP 6;
.BEGIN VERBATIM;SELECT 3;
                 X           Y          Z          NIL
.END

That is, the right-hand branch in this binary tree representation of a sequence
will always point to the rest of the sequence or will be the atom %3NIL%*.
You should become fluent in translating between
s-expr notation and sequence notation.
Since the more usual LISP terminology for a sequence is a list,  we
will refer usually to the  above as %2⊗→list-notation↔←%*. 

Some people
restrict lists so that elements of a list are either atomic or are lists themselves.
We shall not do so. Elements of a list may be arbitrary sexprs;
thus %3(A, (A . B), C)%* is a list of three elements.

A final point:  what about the empty sequence or empty list? Looking at the
graphical interpretation of a sequence it appears natural to take ⊗→%3NIL%*↔← as
the ⊗→empty list↔← since after you have removed all the elements from the list,
%3NIL%*, the list terminator is all that is left. 
To be 
consistent  then, we will represent %3( )%* as the atom %3NIL%*.  And
a notational point:  in graphical notation it is often convenient to write
.GROUP
.BEGIN TABIT2(10,30);

\\as                  .  

\       %3NIL%*
.END
.APART
%1
.BEGIN TURN ON "←";TABIT3(20,30,40);
.GROUP
Thus, for example:

←%3(A, (B, C), D)%* is:



%3
\A\\    D



\\   B\   C

%1
.APART
.GROUP
or:
%3

←A








←B    C    NIL     D          NIL 
%1
.END
or, in "dotted-pair" notation: %3 (A .((B .(C . NIL)).(D . NIL)))%1.
.APART
Finally, in list-notation the commas can be replaced by spaces.
.BEGIN CENTERIT;
%3
←e.g. (A, (B, C), D) ≡ (A(B C)D).
%1
.END
Note:  the "dots" in dot-notation are %2never%* optional.

Let's take stock of our position: We have an intuitive understanding
of what we mean by "sequence". We have described selectors, a constructor,
and a predicate, albeit at an abstract level, for manipulating
sequences. We have  represented our notion of sequences
as sexpressions. Clearly the final step is to represent our
list-manipulators as certain LISP functions.
Now LISP functions, expect their inputs to be sexpressions and expect
their outputs to be sexpressions. Thus, to be precise, 

.BEGIN CENTERIT;SELECT 3;
←%3list[A; (A, B); (C .D)]%1 
is not well-formed. %3(A, B)%* is %2not%* an sexpression.

To be correct, using our representation, we %2should%* have written:
←%3list[A; (A .(B . NIL)); (C . D)]%*

Similarly, the value of this expression should be written as:
←%3(A .((A .(B . NIL)).((C . D) . NIL)))%1

instead of:
←%3(A (A B) (C . D))%1 or %3(A, (A, B), (C . D))%*.
.END

Clearly it is more reasonable to read and write in list notation.
As long as we perform only list-operations on lists there is 
no reason to look at the underlying dotted-pair representation.
However it must be remembered that list operations are carried out on the
machine using the dotted-pair representation. The list notation, then,
is only a convenient abbreviation. %2We%* might carry out the "list-to-dotted-pair"
 transformations implicitly, but a machine which evaluates LISP
expressions will have to have an explict transformation mechanism.
.REQUIRE "PROB4.PUB" SOURCE_FILE;
.SS(%3list%* and %3null%*)
.BEGIN TURN ON "←";
We finally can give representations for the list operations:

⊗→%3islist%*↔← can be defined as:

←%3islist[x] <= [atom[x] → eq[x;NIL]; T → islist[rest[x]] ]%*.

⊗→%3null%*↔← can then be defined as:

←%3null[x] <= [islist[x] → atom[x]; T → NIL].%1⊗↓ Strictly speaking, we
should make %3null%* undefined if %3x%* is not a sequence.←

Similarly, in our representation, the construction
of a list from an arbitrary number of s-expressions 
is an appropriately nested composition of %3cons%*es

⊗→%3list%*↔←%3[%1x%41%1; x%42%*; ... x%4n%*] generates a list consisting of the values
of the n arguments. That is, the effect is:

←%3cons[%1x%41%3;cons[%1x%42%3; ... cons[%1x%4n%3;NIL]] ...] %1.

.BEGIN GROUP;NOFILL;
Examples:
.select 3;
←list[A;B] = (A B)
←list[A;B;C] = (A B C)
←list[cons[A;B];car[(A . B)]] = ((A . B) A)
←list[A;list[B;C]] = list[A;(B C)] = (A (B C))
←list[] = NIL
←list[NIL] = (NIL)
.END

Notice that %3list%* is %2not%* a function in the LISP sense; it %2does%* evaluate its arguments,
but it can take an arbitrary number of them. See {yon(P111)}.
For the moment, %3list%* is simply a notational abbreviation for the nested %3cons%*.


The representation of the element-selector functions should be apparent from the
graphical representation. 
To retreive the n%8th%* element of a list, use %3cad%8n-1%*r%1.

Finally, our representation of sequences decrees that %3rest[x] <= cdr[x]%*.
The reader should discover the  representation of %3concat%*.

Thus we have added lists and list-notation to to the repertoire of LISP.
.END

Though our representation of sequences is such that %3first%*, %3rest%*
and %3concat%*
are %2identical%* to %3car%*, %3cdr%*, and %3cons%* respectively, we will
attempt to use the names %3first%*, %3rest%*, and %3concat%* whenever we wish
to make it clear that we are operating on lists.

To summarize the accomplishments of this section, we have  in
effect described a %2new%* data structure in LISP.
The description process included 

.BEGIN INDENT 0,10,10;

%21.  The abstract operations.%* We give constructors, selectors, and predicates for
the recognition of instances of the data structure.

%22.  The underlying representation.%* We must show how the new data
structure can be represented in terms of existing data structures.

%23.  Abstract operations as concrete functions.%* We must write LISP
functions which faithfully mirror the intended meaning of the abstract operations
when interpreted in the underlying representation.

%24.  The input/output transformations.%* We should give conventions
for transforming to and from the internal representation.

.END

.CENT(Problems)

I Discuss %3cons[x;cons[y;z]]%* as a representation for %3(x, y, z)%*.

.SS(A respite)
.BEGIN TURN ON "←";
This section contains some hints and notes on the introductory material of
this chapter. First a reiteration of a previous admonition: though
this material seems quite straightforward, the next chapter will begin
to show you that things are not all that trivial.  LISP is quite powerful.
The preceding material %2is%* basic and the sooner it becomes second nature
to you the better.  

****DO  general hints on graphing and counting parens***

When graphing `list notation' first discover how many elements are in the list.
This can be done simply by counting parentheses. 
Starting at the left-hand end of the list, mark the left parenthesis
with zero; proceeding to the right, increase the count by one when
encountering a left parenthesis, and decrease the count by one when
a right parenthesis is met. The elements of the list are those subexpressions
which either are delimited by parentheses marked "1" or are top-level atoms.
Once the elements are discovered,
draw the spine of a 
tree which will have as many  left branches as there are elements.  The right
branch %2always%* ends with %3NIL%*. Then examine the structure of each node.
Notice again that to find the n%8th%* element in a list, use %3cad%8n-1%*r%1.
Thus %3cdr%* is %2never%* the second element of a list. %3cadr%* is the second
element, %3cdr%* is the %2rest%* of the list.

When evaluating or writing functions or (predicates) %2always%* keep in mind
any  restrictions of the function:  is it partial? is it total? defined only
for lists? are there restrictions on arguments? When taking %3car%* or %3cdr%* is the
argument non-atomic?

A few tricks were embedded in the problem sets.
In {yonss(P11)}, recall problem %28%*. The composition %3atom[cons[ ...]]%*
will always evaluate to %3NIL%* ⊗↓%1If it has a value at all!
If the computation of the arguments to the %3cons%* does not
terminate (call-by-value) then we obviously won't get %3NIL%*.←
since the result of %3cons%*-ing is always
non-atomic.
In %210.%*, we used identifiers with  the same
letter strings as predicate names, %3ATOM%* and %3EQ%*. %3ATOM%* and %3EQ%*
are perfectly good identifiers, and are %2not%* the LISP predicates.
%214.%* shows that predicates are perfectly good expressions to evaluate; e%41%*
is a predicate. Similarly, in %216.%*, conditional expressions may appear within
functional composition.

.BEGIN TURN ON "#";
Notice that %3twist%* in II is total whereas %3findem%* is partial.
%3findem%* is partial since %3y%* must be atomic. Both functions build 
new trees, %3twistem%* reverses left- and right-branches recursively;
%3findem%* builds a tree with the same branching structure as %3x%*,
but the terminal nodes contain %3T%* at the points where the atom %3y%*
appears in the original tree, and %3NIL%* otherwise.
.P24:
Perhaps you also noticed in %3findem%* that the construction: %3atom[x]#→#eq[x;y]%*
could have been used instead of %3atom[x]#→#[eq[x;y]#→#T;#T#→#NIL]%*. Prove it.
.END

In {yonss(P12)} you should have discovered that the value of:

←%3cons[%1x%41%3,(%1x%42%*, ...x%4n%3)]%1 is: %3(%1x%41%*, x%42%*, ... x%4n%3).

%1Notice that %3list[%1x%41%3;(%1x%42%*, ... x%4n%3)]%1 is 
%3(%1x%41%3 (%1x%42%* ... x%4n%3))%1.
So %3cons%* will add a new element to the front of an existing list. %3list%*
will create a new list whose elements will be the values of the arguments
to %3list%*.

Be clear on the difference between the empty list, %3NIL%*, and 
the list consisting of %3NIL%*, %3(NIL)%*. %3(NIL)%* is 
an abbreviation for %3(NIL . NIL)%*, which certainly is not %3NIL%*.
List-notation is  an abbreviation and can always be translated back
into a sexpr.

.END